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We show how sign problems in simulations of many-body systems can manifest themselves in 
the form of heavy-tailed correlator distributions, similar to what is seen in electron propagation 
through disordered media. We propose an alternative statistical approach for extracting ground 
state energies in such systems, illustrating the method with a toy model and with lattice data for 
unitary fermions. 
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I. INTRODUCTION 

One of the most challenging and interesting problems 
in physics is to understand the properties of a system of 
many strongly interacting fermions. Numerical simula- 
tion is an important tool for understanding the ground 
state, and the common approach is to compute the N- 
body correlator C n (t;4>) = (0|*Ar(r)*j v (0)|0)^, where 
^^(0), ^m(t) are interpolating fields which create an 
TV-body state at Euclidean time zero and annihilate it at 
time t, and <f> is a stochastic field responsible for fermion 
interactions. The field <j> could be the dynamical gluon 
field in the case of QCD, for example, or an auxiliary 
field to induce short-range interactions. For large r the 
averaged correlator asymptotically approaches 

(Ctf(r,0))~.Ze- T *>W (1) 

where Eq{N) is the ground state energy of the system 
and \[Z is the amplitude for "J to create the ground 
state. Therefore if one computes — - In CV(r), where 
CV( T ) = jj Gm{t, (pi) is a sample mean computed on 
an ensemble of TV statistically independent (j> fields, one 
expects to see a "plateau" at large r whose height yields 
the ground state energy Eq(N). Excited state energies 
and response of the ground state to probes can also be 
computed by variations of this technique. 

The computation of — - In Cm (t) can be problematic, 
however: it might be excessively noisy, or it may drift 
with r and never find a plateau. We wish to address these 
problems here, defining the former as a "noise" problem, 
and the latter as an "overlap" problem, both of which 
can be related to the sign problem encountered in lattice 
simulations at nonzero chemical potential. In particu- 
lar, referring to recent lattice simulations by the present 
authors of large numbers of unitary fermions, we show 
that the problems encountered can be manifestations of 
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heavy-tailed distributions for Cm (t, <j>) which make com- 
puting ln(Cjv) very difficult, and that the ideal estimator 
for this quantity might not simply be InCjv, as is com- 
monly used. We find here that a cumulant expansion in 
the log of the correlator is a more efficient estimator, for 
example. More generally, we suggest that a study of the 
statistics of systems exhibiting noise or an overlap prob- 
lem might be exploited to greatly facilitate the extraction 
of useful physics from numerical simulations. 

II. NOISE, AND THE PHYSICAL SPECTRUM 

The sign problem encountered in ./V-particle simula- 
tions does not arise simply because of Fermi statistics; if 
that were the only obstacle one could construct Cat as 
an N x N Slater determinant of one-body propagators, 
with a computational difficulty of computing the determi- 
nant scaling only as N 3 . In contrast, the sign problems 
commonly encountered, such as with QCD at nonzero 
chemical potential, entail computational difficulty which 
grows exponentially with particle number; furthermore, 
serious sign problems can occur in bosonic systems as 
well. Instead, sign problems appear when there are mul- 
tiparticle states for which the energy/constituent is lower 
than for the states one wants to study. For example, if 
(Ca) ~ e~ MAT is the expectation of a 3^4 quark cor- 
relator in QCD for a nucleus of atomic number A and 
mass Ma, the variance in the sample mean Ca can be 
estimated as 

^ ~ (C\C A ) - (C\) (C A ) ~ ^e~ 3Am ^ (2) 

for sample size Af. Since C'a corresponds to 3A quark 
propagators and C\ to 3A anti-quark propagators, the 
variance is dominated by the state with 3A pions and 
a falls off with r much more slowly than the signal one 
is looking for, (Ca), since \m^A <C Ma- This "Lepage 
analysis" [2] suggests there is a noise problem and that 
it arises because in a background gluon field each quark 
propagator is uncorrelated with any other and doesn't 
"know" whether it is to be contained in a light pion or a 
heavy nucleon. This suggests a picture where each corre- 
lator Ca (t, A) in a particular background gauge field A 
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c log(c) 

FIG. 1: Histograms for distributions of c = Cjv(t0) and ln(c) for N = 4 unitary fermions at several times r, taken from 
Ref. [T]. Curves fitting ln(c) are Gaussian, implying that c is approximately log-normal distributed, with a 2 increasing with 
time. 



roughly equals e - 3 - 4 / 2m " T j an d the exponentially smaller 
value expected for (Ca) only arises from subsequent can- 
cellations while averaging over gauge fields. A very sim- 
ilar analysis applies to QCD with nonzero chemical po- 
tential [3J H] . This would be a reasonable picture if the 
distribution of Ca (t, A) over the ensemble of gauge fields 
was normal, with mean e~ MAT and variance e _3j4m,r ' r , 
with large fluctuations concealing an exponentially small 
signal. There are general arguments that suggest this 
is incorrect, however, and that the distribution of many- 
fermion correlation functions will be heavy-tailed and ex- 
tremely non-Gaussian, a result we also find from explicit 
simulations of unitary fermions. In the latter case we 
show that better understanding the nature of the noise 
can help devise an efficient strategy for extracting a sig- 
nal; it is plausible that similar techniques could be more 
widely applicable to noisy systems. 



III. A MEAN FIELD DESCRIPTION 

Nonrelativistic fermions with strong short-range inter- 
actions tuned to a conformal fixed point where the phase 
shift satisfies 8{k) = ir/2 for all k are called "unitary 
fermions" . This nonrelativistic conformal field theory is 
interesting to study both for its simplicity and universal- 
ity, its challenges for many-body theory, and because it 
can be realized and studied experimentally using trapped 
atoms tuned to a Feshbach resonance. It is also an ideal 
theory for studying fermion sign problems on the lattice, 
being much simpler and faster to simulate than QCD. At 
its most basic, the lattice action is the obvious discretiza- 
tion of the Euclidean Lagrangian [5] 

(d T - V 2 /2M )ip -\m 2 4> 2 + <j)^iP (3) 

where 4> is a nonpropagating auxiliary field with m 2 
tuned to a critical value m 2 , and tp is a spin | fermion 
with mass M; a more sophisticated action tuned to re- 
duce discretization errors was recently presented in [6]. 
A simulation of this theory reveals a distribution for 
iV-body correlators Cjy (t, <fi) which is increasingly non- 
Gaussian at late r; in fact, it is InCjv which appears to 



be roughly normally distributed, as shown in Fig. [T] so 
that Cn(t 7 4>) is roughly log- normal distributed with an 
increasingly large a and long tail at late time. 

The appearance of a heavy-tailed distribution should 
not be surprising, since the system is similar to the prob- 
lem of electron propagation in disordered media, where 
heavy-tailed distributions are ubiquitous in the vicinity 
of the Anderson localization transition. For example, it 
is found that for physical quantities such as the current 
relaxation time or normalized local density of states, the 
distribution function P(z) scales as exp(— Cd ln d z). A 
particularly simple way to derive these results is to use 
the optimal fluctuation method of Ref. [7], which is a 
mean field approach. We can adapt these methods to the 
current problem, defining the variable Y = In CV (t, 4>) 
and computing its probability distribution P(y) as 



P(y) = N J Dct>e- s *5(Y(T,ct>)-y) 



(4) 



where S4, = J <£x\4> 2 and S = S4 - itQn C n (t, 4>) - y). 
Using the PDS subtraction scheme [S] we have m 2 = 
MX/Att, where the renormalization scale A is taken to be 
the physical momentum scale in the problem — in this 
case A = kp = (3ir 2 N/V) 1/3 , N/2 being the number of 
fermions with a single spin orientation. We proceed now 
to evaluate this integral using a mean field expansion; it 
is not evident that there is a small parameter to justify 
this expansion, but the leading order result is illuminat- 
ing and fits the numerical data well. We expand about 
4>{x) = 4>q, t = to, and use the fact that for large r the 
n th functional derivative of In Cjy (t, 4>) with respect to 
4>{x) equals the the 1-loop Feynman diagram with n in- 
sertions of ip'ij) in the presence of a chemical potential 
[i = k 2 F j (2M) . The equations for <fio and to are given by 
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where Eq(N) = ZNEp/h is the total energy of N free 
degenerate fermions (N/2 of each spin), and Z is the 
overlap of the source and sink with the free fermion state. 
The leading term in the mean field expansion for P(y) 
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can therefore be expressed as P(y) oc exp 



with 



y =]hZ-tE (N) , a 2 
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This describes a log-normal distribution for the N- 
fermion propagator Cn(t,4>), with both mean and vari- 
ance growing with time in units of the energy of N free 
degenerate fermions. In Fig. [2] we plot the quantities 
— and i ^S— as a function of N obtained from cor- 
relator distribution data for unitary fermions at late r, 
and find that the gross features of the results are com- 
patible with the mean field estimates of unity and 40/97T 
obtained from eq. ([6|. 



IV. A TOY MODEL 

It would be useful to devise an algorithm to reliably 
estimate energies without having to exhaustively sample 
the long tail of the correlator distribution, yet without 
making incorrect assumptions about the exact functional 
form of that tail. An approach we suggest here is to ex- 
ploit the general relationship between stochastic variables 
I and 7 = lnl: 



(7) 



where n n is the n cumulant of Y. This relation can 
be proved by noting that the generating function for the 
K n is ln0y(i) where = (e Yt ) = (A*) is the mo- 

ment generating function for Y , and evaluating at t = 1, 
assumed to be within the radius of convergence. The mo- 
tivation for investigating eq. ^ is that if the distribution 
P(X) were exactly log-normal, the above sum would end 
after the second term, as n n >2 would all vanish; therefore 



1.8 
1.6 
1.4 
1.2 
1.0 
0.8 




40/9n 



-E - l dy/dT\ r . 
Eo-'do- 2 /^. 



10 20 30 40 50 60 

N 

FIG. 2: The quantities - A. |£ an d A- ^ as a function of N 
for unitary fermions at late times on a lattice of size L = 10, 
compared to mean field prediction eq. Q (dashed lines). 



TABLE I: £ determined from 250 blocks of 50,000 configu- 
rations each for the model eq. (J8j) with r = 1000, g = 1/2. 



Method 



£ 



stat. error 



syst. error 



conventional 

Kn<2 
K n <3 
^n<4 
^n<5 
^n<6 



0.014932 
-0.002159 
-0.000412 
-0.000647 
-0.001794 

0.010943 



0.002485 
0.000304 
0.001618 
0.008379 
0.037561 
0.147739 



-0.002165 
-0.000324 
0.000050 
3.34 x 10" 6 
-1.22 x 10~ 6 



by replacing the n n by sampled cumulants and truncat- 
ing the sum at finite order, one might hope to have a 
reliable estimator for ln(A) provided P(X) was nearly 
log-normal, in the sense that the k„ fall off rapidly for 
n > 2. 

Distributions with log-normal-like tails arise naturally 
in products of stochastic variables. The propagator 
Cn (t, <j)) for unitary fermions can be expressed in a trans- 
fer matrix formalism as the product of a t matrices - 
one per time hop — each of which is the direct product of 
N V xV matrices of the form e~ K / 2 (l+gtp)e~ K / 2 , where 
if is a constant matrix (the spatial kinetic operator) , (p is 
a random diagonal matrix with O(l) entries correspond- 
ing to stochastic cj> fields living on the time links, and g is 
a coupling constant (identified with 1 /m 2 in Eq. [3| that 
has been tuned to a particular critical value that is 0(1). 
Unfortunately, little seems to be known about products 
of random matrices, beyond the study in [5] which deals 
with large products of weakly random matrices. There- 
fore we analyze instead a toy model where we define a 
"correlator" C r as a product of random numbers, and an 
"energy" £ = lim T _ ! . 0O £ T where: 



T i 

C T = n( 1+ 9<Pi) , Sr = —HCr) 



(8) 



where < g < 1 and the ipi are independent and iden- 
tically distributed random numbers with a uniform dis- 
tribution on the interval [— 1, 1]. The exact value for the 
energy is obviously £ T = for any r since the statistical 
average of the correlator is (C T ) = 1. The cumulants of 
the variable Y = ln(C T ) are given by 
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for n > 2; for small g one finds that the K n rapidly de- 
crease as n increases for g < 1. Table [I] shows how the 
systematic error in eq. ^ when truncated at n = n max , 
converges to the exact answer £ T = as a function of 
n max for g — 1/2, and shows that even though the distri- 
bution is not log-normal (k„>2 ^ 0) the convergence is 
rapid. 

In Fig. [3] we show the results of a simulation where 



we compute £ T for g 



and t — 1, . . . , 1000. At each 
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FIG. 3: Simulation of the energy £ T for the model eq. Q 
with g = |. The exact answer is £ T = (black line); exact 
values of eq. |7f truncated at order n = 2, 3 are indicated. 




FIG. 4: Energy for 50 unitary fermions in a harmonic trap, 
10 6 configurations; fits performed using expansion eq. Q up 
to order n over the time interval 45-60 (n=2,3) and 13-60 
(n=4,5,6). Inset: conventional effective mass. 



value of r we independently generated an ensemble of 
values for C T of size N — 50, 000. From that ensemble 
we computed £ T by (i) using the conventional estimator 
£ T = — — In C T (blue) , which shows a striking system- 
atic error for r > 50, and statistical noise increasing up 
to t ~ 500 but decreasing beyond that; (ii) using eq. 
([7]) truncated at n — 2 using conventional estimators for 
the K n (green) , showing a r-independent systematic error 
with smaller but slowly growing statistical error; (iii) eq. 
([7]) truncated at n — 3 (red) with a negligible constant 
systematic error but a larger statistical error, growing 
with t. Evidently, one trades systematic error for sta- 
tistical error by truncating eq. |7| at increasingly large 



Table |T] displays results of a simulation of 1.25 x 10 7 
4> configurations blocked into 250 blocks of 50,000 each, 
for the model eq. ^ at r = 1000 and g = 1/2. For 
each case we give the mean and the square root of the 
variance; for the truncated cumulant expansion we also 
give the theoretical systematic error from truncating eq. 
([7]) using our analytic expressions for n n . These numbers 
show how the conventional method gives a wrong answer 
with deceptively small statistical error. One sees again 
the trade of systematic error for statistical error as one 
increases the order n max where one truncates the sum in 
eq. ([7]). Table [I] suggests the place to stop for the smallest 
combined error is at n max = 3, justified by noting that 
the n m ax — 4 result with statistical errors encompasses 
the n m ax — 3 result; we suggest this as a practical al- 
gorithm for determining where to truncate the cumulant 
expansion in general. Fig. [4] shows how this works in a 
real simulation for 50 trapped unitary fermions [T]. 



V. DISCUSSION 

Heavy-tail distributions are likely to be ubiquitous in 
TV-body simulations, and perhaps even in other types of 
noisy calculations. With such distributions theoretical 
statistical means can deviate wildly from sample means 
for any realizable sample size and render conventional 
estimates of expected fluctuations irrelevant. We have 
shown that there are more efficient estimators for ground 
state energies using the cumulants of the log of the corre- 
lator instead of the conventional effective mass, at least 
for positive correlators. This method is presumably only 
effective for nonpositive data when when the heavy-tail is 
asymmetric. It may be useful to think of this procedure 
in a renormalization group language, where the higher 
cumulants behave like irrelevant operators affecting the 
flow toward a log-normal distribution. 
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